Zero-Shot Learning of Language Models for Describing Human Actions Based on Semantic Compositionality of Actions

Authors

  • Hideki Asoh
  • Ichiro Kobayashi
Abstract

We propose a novel framework for zero-shot learning of topic-dependent language models, which enables learning language models for specific topics for which no language data is available. To realize zero-shot learning, we exploit the semantic compositionality of the target topics. Complex topics are normally composed of several elementary semantic components. We found that the language model corresponding to a particular topic can be approximated by a linear combination of the language models corresponding to the elementary components of that topic. On the basis of this finding, we propose simple methods of zero-shot learning. To confirm the effectiveness of the proposed framework, we apply the methods to the problem of generating natural language descriptions of short Kinect videos of simple human actions.
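
The linear-combination idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the model type or how the weights are chosen, so the unigram models, the toy component corpora, the equal weights, and the function names `unigram_lm` and `combine_lms` are all assumptions made for illustration.

```python
from collections import Counter


def unigram_lm(sentences):
    """Estimate a maximum-likelihood unigram model from whitespace-tokenized text."""
    counts = Counter(word for s in sentences for word in s.split())
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}


def combine_lms(models, weights):
    """Linear combination of models: p(w) = sum_i weights[i] * models[i](w)."""
    if len(models) != len(weights):
        raise ValueError("need exactly one weight per model")
    if abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must sum to 1")
    # The combined vocabulary is the union of the component vocabularies.
    vocab = set().union(*models)
    return {
        w: sum(lam * m.get(w, 0.0) for lam, m in zip(weights, models))
        for w in vocab
    }


# Hypothetical elementary components of an unseen topic such as
# "raise the right hand": one model for the motion, one for the body part.
raise_lm = unigram_lm(["he raises his hand", "she raises her arm"])
right_lm = unigram_lm(["the right hand", "his right arm"])

# Equal weights are an arbitrary choice for this sketch.
target_lm = combine_lms([raise_lm, right_lm], [0.5, 0.5])
```

Because each component is a probability distribution and the weights sum to 1, the combined model is again a probability distribution over the joint vocabulary, even though no text for the target topic was used to build it.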


Similar resources

Alternative Semantic Representations for Zero-Shot Human Action Recognition

A proper semantic representation for encoding side information is key to the success of zero-shot learning. In this paper, we explore two alternative semantic representations especially for zero-shot human action recognition: textual descriptions of human actions and deep features extracted from still images relevant to human actions. Such side information is accessible on the Web at little cost...


Multi-Label Zero-Shot Human Action Recognition via Joint Latent Embedding

Human action recognition refers to automatically recognizing human actions from a video clip, which is one of the most challenging tasks in computer vision. Because annotating video data is laborious and time-consuming, most existing works in human action recognition are limited to a number of small-scale benchmark datasets where there are a small number of video clips associated...


A Deep Compositional Framework for Human-like Language Acquisition in Virtual Environment

We tackle a task where an agent learns to navigate in a 2D maze-like environment called XWORLD. In each session, the agent perceives a sequence of raw-pixel frames, a natural language command issued by a teacher, and a set of rewards. The agent learns the teacher’s language from scratch in a grounded and compositional manner, such that after training it is able to correctly execute zero-shot co...


Transductive Multi-label Zero-shot Learning

Zero-shot learning has received increasing interest as a means to alleviate the often prohibitive expense of annotating training data for large scale recognition problems. These methods have achieved great success via learning intermediate semantic representations in the form of attributes and more recently, semantic word vectors. However, they have thus far been constrained to the single-label...


Book Review: "Learning Strategy Instruction in the Language Classroom: Issues and Implementation"

Language learning strategies, “the techniques or devices which a learner may use to acquire knowledge” (Rubin, 1975, p. 43) or more pertinently “complex, dynamic thoughts and actions, selected and used by learners with some degree of consciousness in specific contexts” (Oxford, 2017, p. 48), have been widely researched and discussed for more than forty years since the mid-1970s. Shifting the fo...




Publication date: 2014